Note: When clicking on a Digital Object Identifier (DOI) number, you will be taken to an external site maintained by the publisher.
Some full text articles may not yet be available without a charge during the embargo (administrative interval).
What is a DOI Number?
Some links on this page may take you to non-federal websites. Their policies may differ from this site.
-
Over the past decade, deep reinforcement learning (RL) techniques have significantly advanced robotic systems. However, due to the complex architectures of neural network models, ensuring their trustworthiness is a considerable challenge. Programmatic reinforcement learning has surfaced as a promising approach. Nonetheless, synthesizing robot-control programs remains challenging. Existing methods rely on domain-specific languages (DSLs) populated with user-defined state abstraction predicates and a library of low-level controllers as abstract actions to boot synthesis, which is impractical in unknown environments that lack such predefined components. To address this limitation, we introduce RoboScribe, a novel abstraction refinement-guided program synthesis framework that automatically derives robot state and action abstractions from raw, unsegmented task demonstrations in high-dimensional, continuous spaces. It iteratively enriches and refines an initially coarse abstraction until it generates a task-solving program over the abstracted robot environment. RoboScribe is effective in synthesizing iterative programs by inferring recurring subroutines directly from the robot’s raw, continuous state and action spaces, without needing predefined abstractions. Experimental results show that RoboScribe programs inductively generalize to long-horizon robot tasks involving arbitrary numbers of objects, outperforming baseline methods in terms of both interpretability and efficiency.more » « lessFree, publicly-accessible full text available October 1, 2026
-
We propose a deductive synthesis framework for construct- ing reinforcement learning (RL) agents that provably satisfy temporal reach-avoid specifications over infinite horizons. Our approach decomposes these temporal specifications into a sequence of finite-horizon subtasks, for which we synthesize individual RL policies. Using formal verification techniques, we ensure that the composition of a finite number of subtask policies guarantees satisfaction of the overall specification over infinite horizons. Experimental results on a suite of benchmarks show that our synthesized agents outperform standard RL methods in both task performance and compliance with safety and temporal requirements.more » « lessFree, publicly-accessible full text available July 21, 2026
-
Abstract We propose a deductive synthesis framework for constructing reinforcement learning (RL) agents that provably satisfy temporal reach-avoid specifications over infinite horizons. Our approach decomposes these temporal specifications into a sequence of finite-horizon subtasks, for which we synthesize individual RL policies. Using formal verification techniques, we ensure that the composition of a finite number of subtask policies guarantees satisfaction of the overall specification over infinite horizons. Experimental results on a suite of benchmarks show that our synthesized agents outperform standard RL methods in both task performance and compliance with safety and temporal requirements.more » « lessFree, publicly-accessible full text available July 23, 2026
-
We introduce VELM, a reinforcement learning (RL) framework grounded in verification principles for safe exploration in unknown environments. VELM ensures that an RL agent systematically explores its environment, adhering to safety properties throughout the learning process. VELM learns environment models as symbolic formulas and conducts formal reachability analysis over the learned models for safety verification. An online shielding layer is then constructed to confine the RL agent’s exploration solely within a state space verified as safe in the learned model, thereby bolstering the overall safety profile of the RL system. Our experimental results demonstrate the efficacy of VELM across diverse RL environments, highlighting its capacity to significantly reduce safety violations in comparison to existing safe learning techniques, all without compromising the RL agent’s reward performance.more » « less
-
Deep reinforcement learning (RL) has led to encouraging successes in numerous challenging robotics applications. However, the lack of inductive biases to support logic deduction and generalization in the representation of a deep RL model causes it less effective in exploring complex long-horizon robot-control tasks with sparse reward signals. Existing program synthesis algorithms for RL problems inherit the same limitation, as they either adapt conventional RL algorithms to guide program search or synthesize robot-control programs to imitate an RL model. We propose ReGuS, a reward-guided synthesis paradigm, to unlock the potential of program synthesis to overcome the exploration challenges. We develop a novel hierarchical synthesis algorithm with decomposed search space for loops, on-demand synthesis of conditional statements, and curriculum synthesis for procedure calls, to effectively compress the exploration space for long-horizon, multi-stage, and procedural robot-control tasks that are difficult to address by conventional RL techniques. Experiment results demonstrate that ReGuS significantly outperforms state-of-the-art RL algorithms and standard program synthesis baselines on challenging robot tasks including autonomous driving, locomotion control, and object manipulation. CCS Concepts: •Software and its engineering → Automatic programming.more » « less
-
Decreased dendritic spine density in the cortex is a key pathological feature of neuropsychiatric diseases including depression, addiction, and schizophrenia (SCZ). Psychedelics possess a remarkable ability to promote cortical neuron growth and increase spine density; however, these compounds are contraindicated for patients with SCZ or a family history of psychosis. Here, we report the molecular design and de novo total synthesis of (+)-JRT, a structural analogue of lysergic acid diethylamide (LSD) with lower hallucinogenic potential and potent neuroplasticity-promoting properties. In addition to promoting spinogenesis in the cortex, (+)-JRT produces therapeutic effects in behavioral assays relevant to depression and cognition without exacerbating behavioral and gene expression signatures relevant to psychosis. This work underscores the potential of nonhallucinogenic psychoplastogens for treating diseases where the use of psychedelics presents significant safety concerns.more » « lessFree, publicly-accessible full text available April 22, 2026
-
Abstract Proper regulation of organelle dynamics and inter-organelle contacts is critical for cellular health and function. Both the endoplasmic reticulum (ER) and actin cytoskeleton are known to regulate organelle dynamics, but how, when, and where these two subcellular components are coordinated to control organelle dynamics remains unclear. Here, we show that ER-associated actin consistently marks mitochondrial, endosomal, and lysosomal fission sites. We also show that actin polymerization by the ER-anchored isoform of the formin protein INF2 is a key regulator of the morphology and mobility of these organelles. Together, our findings establish a mechanism by which INF2-mediated polymerization of ER-associated actin at ER-organelle contacts regulates organelle dynamics.more » « less
-
We present a verification-based learning framework VEL that synthesizes safe programmatic controllers for environments with continuous state and action spaces. The key idea is the integration of program reasoning techniques into controller training loops. VEL performs abstraction-based program verification to reason about a programmatic controller and its environment as a closed-loop system. Based on a novel verification-guided synthesis loop for training, VEL minimizes the amount of safety violation in the proof space of the system, which approximates the worst-case safety loss, using gradient-descent style optimization. Experimental results demonstrate the substantial benefits of leveraging verification feedback for synthesizing provably correct controllers.more » « less
An official website of the United States government

Full Text Available